11 research outputs found

    CONcreTEXT @ EVALITA2020: The Concreteness in Context Task

    The CONcreTEXT task focuses on conceptual concreteness: systems were asked to compute a value expressing the extent to which target concepts are concrete (i.e., more or less perceptually salient) within a given context of occurrence. To this end, we developed a new dataset annotated with concreteness ratings, which served as the gold standard for evaluating the systems. Four teams participated in this first edition of the task, submitting a total of 15 runs. Interestingly, these works extend the information on conceptual concreteness available in existing (non-contextual) norms derived from human judgments with new knowledge from recently developed neural architectures, in much the same multidisciplinary spirit in which the CONcreTEXT task was organized.

    Linguistic profile automated characterisation in pluripotential clinical high-risk mental state (CHARMS) conditions: methodology of a multicentre observational study

    Introduction: Language is usually considered the social vehicle of thought in intersubjective communication. However, the relationship between language and high-order cognition seems to evade this canonical and unidirectional description (i.e., the notion of language as a simple means of communicating thought). In recent years, clinical high at-risk mental state (CHARMS) criteria (evolved from the Ultra-High-Risk paradigm) and the Clinical Staging system have been proposed to address the dynamic nature of early psychopathology. At the same time, natural language processing (NLP) techniques have evolved considerably and have been successfully applied to investigate different neuropsychiatric conditions. Combining the at-risk mental state paradigm, the clinical staging system and automated NLP methods, the latter applied to spoken language transcripts, could be a useful and convenient approach to the problem of early psychopathological distress within a transdiagnostic risk paradigm.
    Methods and analysis: Help-seeking young people presenting with psychological distress (CHARMS+/− and Clinical Stage 1a or 1b; target sample size for both groups n=90) will be assessed through several psychometric tools and multiple speech analyses during an observational period of 1 year, in the context of an Italian multicentre study. Subjects will be enrolled in different settings: Department of Neuroscience, Rehabilitation, Ophthalmology, Genetics, Maternal and Child Health (DINOGMI), Section of Psychiatry, University of Genoa - IRCCS Ospedale Policlinico San Martino, Genoa, Italy; Mental Health Department - territorial mental services (ASL 3 - Genoa), Genoa, Italy; and Mental Health Department - territorial mental services (AUSL - Piacenza), Piacenza, Italy. The conversion rate to full-blown psychopathology (CS 2) will be evaluated over 2 years of clinical observation, to further confirm the predictive and discriminative value of the CHARMS criteria and to verify whether they can be enriched with linguistic features derived from a fine-grained automated linguistic analysis of speech.
    Ethics and dissemination: The methodology described in this study adheres to the ethical principles formulated in the Declaration of Helsinki and is compatible with International Conference on Harmonization (ICH) good clinical practice. The research protocol was reviewed and approved by two different ethics committees (CER Liguria approval code: 591/2020 - id.10993; Comitato Etico dell’Area Vasta Emilia Nord approval code: 2022/0071963). Participants will provide written informed consent prior to study enrolment, and parental consent will be required for participants under 18 years of age. Experimental results will be carefully shared through publication in peer-reviewed journals, to ensure proper data reproducibility.
    Trial registration number: DOI:10.17605/OSF.IO/BQZTN

    EVALITA Evaluation of NLP and Speech Tools for Italian - December 17th, 2020

    Welcome to EVALITA 2020! EVALITA is the evaluation campaign of Natural Language Processing and Speech Tools for Italian. It is an initiative of the Italian Association for Computational Linguistics (AILC, http://www.ai-lc.it) and is endorsed by the Italian Association for Artificial Intelligence (AIxIA, http://www.aixia.it) and the Italian Association for Speech Sciences (AISV, http://www.aisv.it).

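    Linking dei contenuti multimediali tra ontologie multilingui: i verbi di azione tra IMAGACT e BabelNet (Linking multimedia content between multilingual ontologies: action verbs between IMAGACT and BabelNet)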

    We present a study on linking two multilingual and multimedia resources, BabelNet and IMAGACT. The task aims to connect the videos contained in the IMAGACT Ontology of Actions with the related verb entries in BabelNet. The linking experiment is based on an algorithm that exploits the lexical information of the two resources. The results show that it is possible to achieve extensive linking between the two ontologies. This linking is highly desirable for building a rich multimedia knowledge base that can be exploited for two complex tasks: reference disambiguation of action verbs, and automatic or assisted translation of both the verbs and the sentences that refer to actions.
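
    The abstract does not specify how the lexical information is compared. As a minimal sketch only, one way to link entries by lexical overlap is to score each video scene against each synset with the Jaccard coefficient of their verb sets. The function names, the threshold value, and the toy data below are hypothetical and are not taken from IMAGACT, BabelNet, or the study.

```python
def jaccard(a, b):
    """Jaccard overlap between two collections of lemmas."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def link_scenes(scene_verbs, synset_lemmas, threshold=0.3):
    """Link each scene to its best-overlapping synset, if the score reaches the threshold.

    The threshold is an arbitrary illustrative value, not one reported in the study.
    """
    links = {}
    for scene_id, verbs in scene_verbs.items():
        best_id, best_score = None, 0.0
        for synset_id, lemmas in synset_lemmas.items():
            score = jaccard(verbs, lemmas)
            if score > best_score:
                best_id, best_score = synset_id, score
        if best_id is not None and best_score >= threshold:
            links[scene_id] = (best_id, best_score)
    return links


# Toy data invented for illustration; identifiers are placeholders.
scene_verbs = {
    "scene_1": {"take", "grab", "prendere"},
    "scene_2": {"roll", "rotolare"},
}
synset_lemmas = {
    "syn_a": {"take", "grab", "prendere", "afferrare"},
    "syn_b": {"sing", "cantare"},
}
links = link_scenes(scene_verbs, synset_lemmas)
print(links)
```

    In this toy run only scene_1 is linked, because scene_2 shares no verb with any synset.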

    Tipos de acciones inducidas a partir de características léxicas multilingües (Action types induced from multilingual lexical features)

    This paper presents a vector representation and a clustering of action concepts based on lexical features extracted from IMAGACT, a multilingual and multimodal ontology of actions in which concepts are represented through video prototypes. We computed vectors for 1,010 action concepts, where the dimensions correspond to verbs in 10 languages. An unsupervised clustering method was then applied to these data in order to discover action classes based on typological closeness. The resulting clusters are neither language-specific nor language-biased, and thus constitute an inter-linguistic classification of the action domain.
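
    The idea of concept vectors whose dimensions are verbs can be sketched as follows. This is an illustration under stated assumptions, not the paper's method: the abstract does not name the clustering algorithm, so single-link agglomerative clustering over cosine similarity is swapped in here, and the concepts, verbs, and threshold are invented toy values.

```python
import math


def concept_vectors(concept_verbs):
    """Build binary vectors with one dimension per (language, verb) pair."""
    dims = sorted({v for verbs in concept_verbs.values() for v in verbs})
    index = {d: i for i, d in enumerate(dims)}
    vectors = {}
    for concept, verbs in concept_verbs.items():
        vec = [0] * len(dims)
        for v in verbs:
            vec[index[v]] = 1
        vectors[concept] = vec
    return vectors


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster(vectors, threshold=0.5):
    """Single-link agglomerative clustering.

    Two clusters are merged whenever any cross-cluster pair of concepts
    has cosine similarity at or above the (hypothetical) threshold.
    """
    clusters = [[c] for c in sorted(vectors)]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if any(cosine(vectors[a], vectors[b]) >= threshold
                       for a in clusters[i] for b in clusters[j]):
                    clusters[i] = sorted(clusters[i] + clusters[j])
                    del clusters[j]
                    merged = True
                    break
            if merged:
                break
    return clusters


# Toy concepts invented for illustration; each verb is tagged with its language.
concept_verbs = {
    "c1": {("en", "take"), ("it", "prendere")},
    "c2": {("en", "take"), ("it", "prendere"), ("es", "tomar")},
    "c3": {("en", "roll"), ("it", "rotolare")},
}
vectors = concept_vectors(concept_verbs)
clusters = cluster(vectors)
print(clusters)
```

    Because the dimensions pool verbs from all languages, two concepts fall together only when they are lexicalized by the same verbs across languages, which is the sense in which such clusters are not tied to a single language.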

    Linking IMAGACT ontology to BabelNet through action videos

    We present a study on linking two multilingual and multimedia resources, BabelNet and IMAGACT, which seeks to connect the videos contained in the IMAGACT Ontology of Actions with related action concepts in BabelNet. The linking is based on a machine learning algorithm that exploits the lexical information of the two resources. The algorithm was first trained and tested on a manually annotated dataset and then run on all the data, making it possible to connect 773 IMAGACT action videos with 517 BabelNet synsets. This linkage aims to enrich BabelNet verbal entries with visual representations and to connect the IMAGACT ontology to the huge BabelNet semantic network.

    An NLP pipeline as assisted transcription tool for speech therapists

    This work presents the design of a computer-assisted transcription system for speech-language therapists and an evaluation of its core module, the NLP pipeline. This pipeline combines a tokenizer, a lemmatizer, a part-of-speech tagger and a spellchecker to perform semi-automatic annotation of speech transcriptions. The implemented module was evaluated on a corpus of spoken interactions between children with Developmental Language Disorder (DLD) and their caregivers. Results are promising for automatic error detection (F-measure of 0.547 against a Ground Truth of 0.616) but low for automatic error correction, and they confirm the pipeline's effectiveness within an assisted transcription tool.
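
    For readers unfamiliar with the metric, the sketch below shows how an F-measure for error detection can be computed from the set of token positions a system flags and the set marked as errors in a reference annotation. It assumes the balanced F1 variant (beta = 1), which the abstract does not state; the function name and the toy positions are hypothetical and unrelated to the reported scores.

```python
def f_measure(predicted, gold, beta=1.0):
    """Return (precision, recall, F-beta) for predicted vs. gold error positions."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    b2 = beta * beta
    f = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f


# Toy token positions invented for illustration.
gold_errors = {2, 5, 7, 9}   # positions annotated as errors in the reference
flagged = {2, 5, 8}          # positions flagged by a hypothetical detector
precision, recall, f1 = f_measure(flagged, gold_errors)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```

    Here two of the three flagged positions are correct and two of the four reference errors are found, so the score balances a precision of 2/3 against a recall of 1/2.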
